A Hybrid Embedding Approach to Noisy Answer Passage Retrieval

Authors

  • Daniel Cohen
  • W. Bruce Croft
Abstract

Answer passage retrieval is an increasingly important information retrieval task as queries become more precise and as mobile and audio interfaces become more prevalent. In this task, the goal is to retrieve a contiguous series of sentences (a passage) that concisely addresses the information need expressed in the query. Recent work with deep learning has shown the efficacy of distributed text representations for retrieving sentences or tokens for question answering. However, determining the relevance of answer passages remains a significant challenge, specifically when there is a lexical and semantic gap between the text representation used for training and the collection’s vocabulary. In this paper, we demonstrate the flexibility of a character-based approach on the task of answer passage retrieval. The approach is agnostic to the source of embeddings and improves P@1 and MRR over a word-based approach as the collections degrade in quality.
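
For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how P@1 and MRR are commonly computed from ranked relevance judgments. It is not the authors' evaluation code. The function names and the toy data are assumptions made for illustration, and each query is represented as a list of booleans ordered from the top-ranked passage downward.

```python
from typing import Sequence


def precision_at_1(ranked_relevance: Sequence[Sequence[bool]]) -> float:
    """Fraction of queries whose top-ranked passage is relevant."""
    if not ranked_relevance:
        return 0.0
    hits = sum(1 for labels in ranked_relevance if labels and labels[0])
    return hits / len(ranked_relevance)


def mean_reciprocal_rank(ranked_relevance: Sequence[Sequence[bool]]) -> float:
    """Mean over queries of 1/rank of the first relevant passage.

    A query with no relevant passage in its ranking contributes 0.
    """
    if not ranked_relevance:
        return 0.0
    total = 0.0
    for labels in ranked_relevance:
        for rank, is_relevant in enumerate(labels, start=1):
            if is_relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)


# Hypothetical judgments for four queries (illustrative only, not from the paper).
toy_runs = [
    [True, False, False],         # first relevant passage at rank 1
    [False, True, False],         # first relevant passage at rank 2
    [False, False, False],        # no relevant passage retrieved
    [False, False, False, True],  # first relevant passage at rank 4
]

print(precision_at_1(toy_runs))        # 0.25
print(mean_reciprocal_rank(toy_runs))  # 0.4375
```

P@1 only rewards a relevant passage in the top position, while MRR also gives partial credit when the first relevant passage appears lower in the ranking.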

Similar resources

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information from the answer to one question to answer another, related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...

Boosting Passage Retrieval through Reuse in Question Answering

Question Answering (QA) is an important emerging field in Information Retrieval. In a QA system, the archive of questions previously asked of the system forms a collection full of useful factual nuggets. This paper makes an initial attempt to investigate the reuse of facts contained in the archive of previous questions to improve performance in answering future related factoid questions. I...

Steganography Scheme Based on Reed-Muller Code with Improving Payload and Ability to Retrieval of Destroyed Data for Digital Images

In this paper, a new steganography scheme with high embedding payload and good visual quality is presented. Before the embedding process, the secret information is encoded in blocks using a Reed-Muller error correction code. After the data are encoded and embedded into the low-order bits of the host image, a modulus function is used to increase the visual quality of the stego image. Since the proposed method is able to embed...

Simple Translation Models for Sentence Retrieval in Factoid Question Answering

Many question-answering systems start with a passage retrieval system to facilitate the answer extraction process. The richer the set of passages, in terms of answer content, the more accurate the answer extraction. We present a simple translation model for passage retrieval at the sentence level. We demonstrate this framework on TREC data, and show that it performs better than retrieval based ...

Answer Passage Retrieval for Question Answering

Document or passage retrieval is typically used as the first step in current question answering systems. The accuracy of the answer that is extracted from the passages and the efficiency of the question answering process will depend to some extent on the quality of this initial ranking. We show how language model approaches can be used to improve answer passage ranking. In particular, we show h...

Publication date: 2018